Abstract

Next generation sequencing is quickly emerging as the go-to tool for plant virologists when sequencing whole virus 
genomes, and undertaking plant metagenomic studies for new virus discoveries. This study aims to compare the genomic 
and biological properties of Bean yellow mosaic virus (BYMV) (genus Potyvirus) isolates from Lupinus angustifolius plants with 
black pod syndrome (BPS), systemic necrosis or non-necrotic symptoms, and from two other plant species. When one Clover 
yellow vein virus (ClYVV) (genus Potyvirus) and 22 BYMV isolates were sequenced on the Illumina HiSeq2000, one new ClYVV 
and 23 new BYMV sequences were obtained. When the 23 new BYMV genomes were compared with 17 other BYMV 
genomes available on Genbank, phylogenetic analysis provided strong support for existence of nine phylogenetic 
groupings. Biological studies involving seven isolates of BYMV and one of ClYVV gave no symptoms or reactions that could 
be used to distinguish BYMV isolates from L. angustifolius plants with black pod syndrome from other isolates. Here, we 
propose that the current system of nomenclature based on biological properties be replaced by numbered groups (I-IX). 
This is because use of whole genomes revealed that the previous phylogenetic grouping system based on partial sequences 
of virus genomes and original isolation hosts was unsustainable. This study also demonstrated that, where next generation 
sequencing is used to obtain complete plant virus genomes, consideration needs to be given to issues regarding sample 
preparation, adequate levels of coverage across a genome and methods of assembly. It also provided important lessons 
that will be helpful to other plant virologists using next generation sequencing in the future. 
Introduction

Next generation sequencing (NGS) technologies are fast 
becoming a popular method to obtain whole plant virus genomes 
in a relatively short period of time [1]. Their uptake by plant 
virologists has been slower than by their counterparts in the 
medical sciences where the applications are extending much 
further, rapidly approaching the concept of personalized medicine. 
Such a situation was impossible before the advent of NGS and its 
rapid evolution into an affordable and accessible technology now 
appearing on laboratory bench-tops throughout the world [2,3]. 
Because of the ability to use total RNA extractions for NGS, it is 
becoming increasingly common to use it to sequence complete 
genomes of plant viruses and still obtain excellent results [4-9]. 
The challenge now lies not in accessing and using NGS 
technology, but in analyzing and interpreting the very large 
datasets suddenly at our disposal [1]. 

Bean yellow mosaic virus (BYMV) (family Potyviridae, genus 
Potyvirus) is a single stranded positive sense RNA virus that occurs 
worldwide. It is a virus with an extensive natural host range that 
encompasses monocots and dicots, and both domesticated and 
wild plant species [10,11]. It is transmitted non-persistently by 
many different aphid species [12]. BYMV causes serious diseases 
and losses in many cultivated plant species worldwide. For 
example, early BYMV infection, which causes serious losses, 
normally results in systemic necrosis and plant death [13-15]. In 
contrast, late infection with BYMV causes black pod syndrome 
(BPS) in Lupinus angustifolius (narrow-leafed lupin) also resulting 
in damaging losses [16]. Plants with BPS develop characteristic 
flat, black pods that have little or no seed [17]. It seems likely that 
both the BPS and systemic necrosis responses are related to the 
presence of the hypersensitivity gene Nbm-1 and another similar 
resistance gene [15,18-20]. 

Wylie et al. [21] provided evidence for existence of seven 
BYMV phylogenetic groupings based on coat protein (CP) 
sequences and the original hosts of the isolates sequenced: one 
generalist group with a broad host range including monocots and 
dicots called the general group, and six other specialist groups each 
named after the original hosts of the isolates within them (broad 
bean, canna, lupin, monocot, pea, W). Partial CP sequences from 
BYMV isolates originally from L. angustifolius plants with BPS, 
systemic necrosis or non-necrotic symptoms placed all of them into 
the general group [16,21]. 

This study aims to compare the genomic and biological 
properties of BYMV isolates from L. angustifolius plants with 
BPS, systemic necrosis or non-necrotic symptoms, and from two 
other plant species. NGS was used to sequence 22 BYMV isolates, 
obtained as part of a study conducted in 2011 and from previous 
studies in south-west Australia [16,19]. Here, we present the 
results of genome comparisons of the resulting 23 new BYMV 
genomes and one Clover yellow vein virus (ClYVV) genome with 
17 genomes retrieved from Genbank, and biological host range 
studies with seven BYMV isolates and one ClYVV isolate. We also make 
recommendations based on the lessons learned from our NGS 
studies which will be useful to plant virologists employing this 
approach to obtain whole genomes of other plant viruses. 

Materials and Methods 

Isolates and host plants 

Seventeen BYMV isolates were collected from L. angustifolius 
plants with BPS (i.e. systemic necrotic stem streaking with black 
pods) (11) and systemic necrosis (no black pods) (6), and two from 
L. cosentinii plants with mosaic and leaf deformation as part of a 
2011 study in south-western Australia [16]. The remaining three 
BYMV isolates (FB, LMBNN and LP) were from previous studies 
[19]. They had been maintained as freeze-dried leaf material 
obtained from the West Australian Plant Pathogen Culture 
Collection (FB - WAC 10051, LMBNN - WAC 10094 and LP - 
WAC 10059). The ClYVV isolate was from the same culture 
collection (WAC 10102). 

All plants were maintained at 18-22°C in an insect-proof, air 
conditioned glasshouse. Plants of L. angustifolius cvs Jenabillup 
(partially resistant to BPS), Mandelup (susceptible to BPS) and 
germplasm accession P26697 (Nbm-1 gene absent) were grown in 
washed river sand. Plants of Nicotiana benthamiana, Trifolium 
subterraneum cv. Woogenellup (subterranean clover), Chenopodium 
amaranticolor, C. quinoa, Pisum sativum cv. Greenfeast (pea) 
and Vicia faba cv. Coles early dwarf (faba bean) were grown in 
steam-sterilised potting mix. Cultures of virus isolates were 
maintained by serial mechanical inoculation of infective sap to 
plants of N. benthamiana or T. subterraneum. For inoculations 
to maintain cultures, or as part of experiments, virus-infected 
leaves from systemically infected plants were ground in 0.1 M 
phosphate buffer, pH 7.2, and the infective sap mixed with celite 
before being rubbed onto leaves. 

For testing by ELISA, leaf samples were extracted (1 g per 
20 ml) in phosphate-buffered saline (10 mM potassium phosphate, 
150 mM sodium chloride, pH 7.4, Tween 20 at 5 ml/liter, and 
polyvinyl pyrrolidone at 20 g/liter) using a mixer mill (Retsch, 
Germany). Sample extracts were tested for BYMV or ClYVV by 
double-antibody sandwich ELISA based on a modified protocol 
described by Clark and Adams [22] and according to the 
manufacturer's recommendations. For generic Potyvirus testing, samples 
were extracted in 0.05 M sodium carbonate buffer, pH 9.6, and 
tested using the antigen-coated indirect ELISA protocol of 
Torrance and Pead [23]. The polyclonal antiserum to BYMV 
was from DSMZ, Germany (AS-0717), to ClYVV from Neogen 
Phytodiagnostics, formerly Adgen, UK (1171-05), and to generic 
potyvirus from Agdia, USA (SRA27200). All samples were tested 
in duplicate wells in microtiter plates. Sap from BYMV- or ClYVV- 
infected and healthy T. subterraneum leaf samples was included in 
paired wells to provide positive and negative controls. The 
substrate was p-nitrophenyl phosphate at 1.0 mg/ml in 
diethanolamine, pH 9.8, at 100 ml/liter. Absorbance values (A405) 
were measured in a microplate reader (Bio-Rad Laboratories, 
USA). Absorbance values of positive samples were always more 
than three times those of the healthy sap control. 

Sequence data 

Twenty-two BYMV samples and one ClYVV sample were sent for NGS 
on an Illumina HiSeq 2000 (Table 1). For BYMV, in total there 
were 11 samples from L. angustifolius plants with BPS, six from L. 
angustifolius plants with systemic necrosis and one from a L. 
angustifolius plant with non-necrotic symptoms. The remaining 
samples consisted of isolates from other Lupinus spp. or were 
isolates from other hosts representing other phylogenetic groups 
based on Wylie et al. [21], including two samples from L. 
cosentinii, one from L. pilosus, and one from V. faba. The single 
ClYVV sample was from T. repens (white clover). Total RNA was 
extracted from each sample using a Spectrum Plant Total RNA kit 
(Sigma-Aldrich, Australia). Following extraction, total RNA was 
sent to the Australian Genome Research Facility (AGRF) for 
library preparation and barcoding (24 samples per lane) before 
100 bp paired-end sequencing on an Illumina HiSeq2000. For 
each sample, reads were first trimmed using CLC Genomics 
Workbench 6.5 (CLCGW) (CLC bio) with the quality scores limit 
set to 0.01, maximum number of ambiguities to two and removing 
any reads with <30 nucleotides (nt). Contigs were assembled using 
the de novo assembly function of CLCGW with automatic word 
size, automatic bubble size, minimum contig length 500, mismatch 
cost two, insertion cost three, deletion cost three, length fraction 
0.5 and similarity fraction 0.9. Contigs were sorted by length and 
the longest subjected to a BLAST search [24]. In addition, reads 
were imported into Geneious 6.1.6 (Biomatters) and provided 
with a reference sequence obtained from Genbank (JX173278 for 
BYMV and NC003536 for ClYVV). Mapping was performed with 
minimum overlap 10%, minimum overlap identity 80%, allow 
gaps 10% and fine tuning set to iterate up to 10 times. A consensus 
between the contig of interest from CLCGW and the consensus 
from mapping in Geneious was created in Geneious by alignment 
with Clustal W. Open reading frames (ORFs) were predicted and 
annotations made using Geneious. Finalized sequences were 
designated as "complete" based on comparison with the reference 
sequences used in the mapping process, "nearly complete" if some 
of the 5' or 3' UTR was missing but the coding region was intact, 
and "partial" if all of the 5' or 3' UTR and some of the P1 or CP 
genes were missing. 
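
The length and ambiguity rules used during read trimming can be illustrated with a minimal Python sketch. It is not the CLCGW implementation: the quality-score trimming (limit 0.01) is not reproduced, the function name is hypothetical, and any base other than A, C, G or T is assumed to count as an ambiguity.

```python
def passes_length_and_ambiguity_filter(read, max_ambiguous=2, min_length=30):
    """Return True if a read satisfies the two simple trimming rules.

    Assumptions (not taken from CLCGW): any character other than
    A, C, G or T counts as one ambiguity, and reads shorter than
    min_length nucleotides are discarded.
    """
    ambiguous = sum(1 for base in read.upper() if base not in "ACGT")
    return len(read) >= min_length and ambiguous <= max_ambiguous


# Three toy reads: one clean, one with many Ns, one too short.
reads = ["ACGT" * 10, "ACGTNNNACGT" * 4, "ACGTACGT"]
kept = [r for r in reads if passes_length_and_ambiguity_filter(r)]
```

Only the first toy read is kept: the second carries 12 ambiguous bases and the third is 8 nt long.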

Phylogenetic analysis 

The new sequences were aligned with the 17 retrieved from 
Genbank using Clustal W in MEGA 5.2.1, prior to phylogenetic 
analysis [25]. Phylogenetic analysis compared (i) coding regions of 
all BYMV genome sequences and (ii) coding regions of all BYMV 
genome sequences except seven with average coverage of 10 times 
or less. Neighbor-joining trees were made using the number of 
differences model with 1000 bootstrap replications, Maximum 
Likelihood trees using the Tamura-Nei model with 1000 bootstrap 
replications, and Minimum Evolution trees using the number of 
differences model with 1000 bootstrap replications. Tables of 
nucleotide (nt) percentage differences were calculated for the 
complete genomes using the pairwise comparison function with 
the number of differences model. Final sequences were submitted 
to the European Nucleotide Archive (ENA) with accession 
numbers HG970847-HG970870 (Table 1). 
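
As an illustration of how a nucleotide percentage identity follows from a count of differences between two aligned sequences, a minimal Python sketch is given below. It is not the MEGA implementation: the function name is hypothetical, and skipping alignment columns that contain a gap in either sequence is an assumption made here for simplicity.

```python
def percent_identity(seq_a, seq_b):
    """Percentage nt identity between two aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are
    ignored, so identity is computed over compared sites only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * (compared - differences) / compared


# One difference in ten compared sites gives 90% identity.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```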

Biological data 

For host range studies, seven isolates of BYMV and one of 
ClYVV were mechanically inoculated onto leaves of L. angustifolius, 
N. benthamiana, T. subterraneum, C. amaranticolor, C. 
quinoa, P. sativum and V. faba plants (5 plants/isolate). For each 
experimental host, uninoculated and mock-inoculated controls 
were included at time of inoculation (five plants each). There were 
five isolates from L. angustifolius, one from a plant with BPS 
(AR93C), three from plants with systemic necrosis (MD5, GB17A 
and ES11A), and one from a plant with non-necrotic symptoms 
(LMBNN). The remaining isolates were from plants of L. 
cosentinii (MD7) and L. pilosus (LP) with non-necrotic symptoms. 
Symptoms were recorded and samples from inoculated and tip 
leaves tested by ELISA weekly beginning 7 days after inoculation 
for up to six weeks. 

Results 

Sequence data 

From the single ClYVV and 22 BYMV samples, the numbers of 
raw reads obtained from NGS were 10,841,138-31,131,660, but 
these numbers were reduced to 10,582,250-29,877,478 after 
trimming (Table 1). Following de novo assembly of each individual 
sample using CLCGW, the numbers of contigs produced were 
149-2,498. Contig of interest lengths were 534-9,655 nt with 
average coverage 3-10,173 times and the numbers of reads 
mapped to each contig were 18-987,972. After mapping to a 
reference genome in Geneious, the lengths of the consensus 
sequences were 9,034-10,324 nt, with average coverage of 
4-12,313 times and the numbers of reads mapped to the reference 
sequence were 471-1,002,513. Final sequences consisted of 
the consensus of the contig from CLCGW and the consensus from 
Geneious, and were 9,274-9,530 nt long. All samples yielded one 
sequence of interest, with the exception of FB, which contained a 
second BYMV sequence which we called "LPexFB". In all cases, 
except for ClYVV, the contigs of interest were most closely related 
to BYMV after being subjected to Blastn analysis. ClYVV was 
most closely related to the only other ClYVV complete genome 
available on Genbank. In total, there were nine complete 
genomes, ten nearly complete genomes (including ClYVV) and 
five partial genomes. 
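
The three completeness categories defined in Materials and Methods can be summarised as a simple decision rule. The Python sketch below is a simplification for illustration only: the function and its two inputs are hypothetical, and reducing each genome to "number of UTR nucleotides missing" and "coding region intact or not" is an assumption, not the procedure used in Geneious.

```python
def classify_genome(missing_utr_nt, coding_region_intact):
    """Assign a completeness label to a finalized sequence.

    Assumptions: missing_utr_nt is the number of 5' plus 3' UTR
    nucleotides absent relative to the reference, and
    coding_region_intact is False when part of the P1 or CP gene
    is missing.
    """
    if not coding_region_intact:
        return "partial"
    if missing_utr_nt > 0:
        return "nearly complete"
    return "complete"


# A genome lacking 12 nt of UTR but with a full coding region.
label = classify_genome(missing_utr_nt=12, coding_region_intact=True)
```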

Phylogenetic analysis 

Phylogenetic analysis comparing the coding regions of 23 new 
complete or nearly complete BYMV genomes and one new nearly 
complete ClYVV genome with those of 17 BYMV and one 
ClYVV genome retrieved from Genbank provided 100% bootstrap 
support for eight of nine phylogenetic groups (I, II, IV-IX). 
The remaining group (III) had 98% bootstrap support. Seven of 
the new genomes had average coverages of less than or equal to 
ten times (MD5, MD6, GB42C, ES69C, ES67C, PN77C and 
AR98C) and five of these (MD5, MD6, ES67C, ES69C and 
PN77C) did not sit well within groups I and II. Although they 
appear to belong to them, genomes such as MD6 and PN77C sit 
out on their own, almost separate from the other sequences, 
leaving groups I and II poorly resolved (Figure 1a). In contrast, 
when sequences of the seven genomes with poor average coverage 
(≤10 times) were removed, phylogenetic analysis gave the same 
results but with much greater resolution between groups I and II 
and improved bootstrap support for groups I-IX (Figure 1b). 
Those removed were designated as "draft" genomes because all 
had low coverage and/or small gaps. When all the genomes, 
including those with poor coverage, were analyzed using Maximum 
Likelihood or Minimum Evolution methods, the tree 
topologies were the same as with the Neighbor-Joining method. 

The range of original isolation hosts within each grouping 
varied (Table 2). Group I consisted of nine sequences from two 
dicot and two monocot species. Group II consisted of seven 
sequences from two dicot and one monocot species. Group III 
consisted entirely of three sequences from one monocot species. 
Group IV was made up of three sequences from an unknown 
original host or hosts, as well as two from a monocot and one from 
a dicot species. Groups V-IX consisted entirely of dicot species 
belonging to a single family, and were represented by up to three 
sequences. All dicot species were from families Fabaceae or 
Gentianaceae, and all monocot species were from families 
Orchidaceae or Iridaceae. 

Sequence analysis 

When the coding regions of the 16 new BYMV genomes (draft 
genomes excluded) and one ClYVV genome were analyzed 
against those retrieved from Genbank, the nt percentage identities 
within each phylogenetic group were ≥96.6% (I), ≥98.6% (II), 
≥93.9% (III), ≥94% (IV), ≥90.7% (V), ≥99.8% (VI), ≥97.6% (VII) 
and ≥97.5% for ClYVV (Table S1). When the six sequences from 
L. angustifolius plants with BPS were compared to each other 
their percentage nt identities were ≥93.8%. When the sequences 
from all L. angustifolius plants were compared to each other their 
percentage nt identities were also ≥93.8%. Across all 33 BYMV 
sequences used in this analysis the nt identities were ≥75.6%. 
When the ClYVV sequences were compared to the BYMV 
sequences, overall the percentage nt identities were 66.4-67.9%. 

Biological data 

All seven BYMV isolates and one ClYVV isolate inoculated to 
plants caused systemic symptoms of varying severity in N. 
benthamiana, T. subterraneum and V. faba (Table 3). However, 
apart from ClYVV and BYMV isolate GB17A in V. faba, none of 
them induced systemic necrotic symptoms, which were severe only 
with ClYVV. In C. amaranticolor, ClYVV and five BYMV isolates 
caused obvious systemic symptoms, while infection was restricted 
to inoculated leaves with the isolate originally from an L. angustifolius 
plant with BPS and another originally from an L. 
angustifolius plant with non-necrotic symptoms. In C. quinoa, 
although all isolates infected inoculated leaves, only ClYVV 
caused systemic invasion. In contrast, in P. sativum, only BYMV 
isolate LP caused any infection. 

In L. angustifolius cvs Jenabillup and Mandelup, three BYMV 
isolates caused systemic non-necrotic symptoms. These were 
originally from plants of this species with non-necrotic symptoms 
(LMBNN) or systemic necrotic symptoms (ES11A), and from an L. 
cosentinii plant with mosaic and leaf distortion (MD7). 
All other BYMV isolates and the ClYVV isolate caused systemic necrotic 
symptoms in cvs Jenabillup and Mandelup. In accession P26697, 
with ClYVV and four BYMV isolates for which symptom data are 
available, the reactions resembled those in cv. Jenabillup, with the 
exception of MD5 which produced severe mosaic (i.e. non- 
necrotic) symptoms instead of systemic necrosis. Isolates LMBNN 
and ES11A caused non-necrotic symptoms, while ClYVV and LP 
caused systemic necrosis. Failure of isolates AR93C and MD7 to 
infect P26697 probably represents escapes, but there was no seed 
left of P26697 for further testing. Isolate LP did not infect L. 
angustifolius cv. Mandelup on two separate occasions by sap 
inoculation, but further inoculations using grafting or aphids 
would be needed to establish if this is a resistance reaction. 

Discussion 

Before this study was conducted, there were only 17 complete 
BYMV genomes on Genbank. The ten complete and eight nearly 
complete genomes from this study doubled available BYMV 

Figure 1. Neighbor-joining relationship phylograms obtained from alignment of the coding regions of Bean yellow mosaic virus 
(BYMV) genomes. The alignments were generated in MEGA 5.2.1 using ClustalW and tree branches were bootstrapped with 1000 replications. The 
trees were rooted with a sequence of Clover yellow vein virus (ClYVV), the closest relative to BYMV. New isolates from this study are shown in grey, isolates 
obtained from Lupinus angustifolius plants with BPS are denoted by * and isolates with genomes designated as "draft" are denoted by +. a) Complete 
coding regions of BYMV genomes, including draft sequences, with isolates retrieved from Genbank. b) The same sequences as in a) but with draft 
sequences removed from the analysis. 
doi:10.1371/journal.pone.0104580.g001 



Table 2. Original hosts of isolates within each phylogenetic grouping. 

Phylogenetic group (old name) | Accession numbers | Dicot | Monocot
I (general) | FJ492961, JX173278, HG970847, HG970851, HG970851-52, HG970856-57, HG970860-62, HG970864-65, HG970865 | Lupinus angustifolius^a (6)^b, L. cosentinii (1) | Diuris magnifica (1), Freesia sp. (1)
II (general) | JX156423, HG970848, HG970850, HG970854-55, HG970858-59, HG970863 | L. angustifolius (5), L. cosentinii (1) | Diuris sp. (1)
III (monocot) | AB079886, AB079887, AB439729 | - | Gladiolus hybrid (3)
IV (general) | AB079888^c, D83749^c, NC003492^c, AB439730, AM884180, AY192568 | Eustoma russellianum (1) | Gladiolus sp. (1), Gladiolus hybrid (1)
V (faba bean) | AB439732, U47033 | Trifolium pratense (1), Vicia faba (1) | -
VI (lupin) | HG970866, HG970868 | L. pilosus (1), Vicia faba (1) | -
VII (faba bean) | AB439731, HG970867 | V. faba (2) | -
VIII (W) | DQ641248 | L. albus (1) | -
IX (pea) | AB373203 | Pisum sativum (1) | -

^a Species from Lupinus, Vicia and Trifolium are from family Fabaceae. Eustoma is from family Gentianaceae, Gladiolus and Freesia are from family Iridaceae, Diuris is from 
family Orchidaceae. 
^b Numbers in parentheses represent the numbers of genomes from this original isolation host. 
^c Denotes an unknown original host for that accession number. 
doi:10.1371/journal.pone.0104580.t002 

genomic data in the database. Moreover, the five additional partial 
genomes we obtained will be useful in future studies. Our genome 
results enabled the phylogenetic makeup of BYMV to be 
examined thoroughly, revealing presence of nine distinct groups, 
including the subdivision of the former generalist group into three 
new groups. We recommend replacing the phylogenetic groupings 
of Wylie et al. [21] with numbered group names (I-IX). We have 
not included one former specialist group based on CP genes, the 
canna group, in our analysis because it was not represented by any 
whole genome sequence. Use of whole genomes revealed that the 
previous phylogenetic grouping system based on partial genome 
sequences and original isolation hosts was unsustainable. This is 
because genome sequences from broad bean are present in two 
former specialist groups (now V and VII), from various Lupinus 
species in two former specialist groups (now VI and VIII), and two 
former generalist groups (now I and II). Moreover, although we 
have not re-analyzed sequences of CP genes, Wylie et al. [21] had 
previously placed a CP sequence from the dicot species Eustoma 
russellianum (family Gentianaceae), in the former monocot group 
(now III). Numbering of groups prevents such confusion arising 
from use of natural isolation host names. Our results highlight the 
importance of using complete genomes wherever possible to define 
phylogenetic groupings. The results also highlight the need for 
further sequencing and analysis of BYMV isolates likely to belong 
to former specialist phylogenetic groupings, which will provide 
greater insight into the genetic makeup of BYMV. 

Close examination of the nt percentage sequence identities 
between BYMV and ClYVV genomes revealed that the divergence 
between them is greater than previously thought. Overall, 
BYMV percentage nt identities ranged from 75.6 to 99.5%. The 
species demarcation for potyviruses is currently 23-24% divergence 
at the nt level [26], and some of the BYMV isolates 
compared came close to this. The two ClYVV genomes shared 
97.5% nt identity, but when compared to all the BYMV genomes, 
nt identities were 66.4-67.9%, well beyond the species demarcation 
point for potyviruses. ClYVV was originally considered an 
isolate of BYMV but was later shown to be a distinct virus 
[26,27,28]. Our percentage identities support that distinction. 
However, some BYMV phylogenetic groups were more closely 
related to ClYVV than others. For example, when compared with 
all other BYMV sequences, the single sequences from groups VIII 
and IX had percentage identities of just 78.4-79.8% and 75.6-76.9% 
to BYMV respectively, whereas when compared to ClYVV 
their nt percentage identities were 67.0-67.7% (Table S1). Again, 
further genome sequences from these groups and ClYVV are 
required for a more conclusive analysis. 

Based on our phylogenetic and sequence analyses, BYMV 
isolates associated with BPS in L. angustifolius were not different 
phylogenetically from other BYMV isolates we sequenced from L. 
angustifolius, L. cosentinii, or other hosts within the same 
phylogenetic groups (I and II). Also, from the host data from 
our inoculations, there was no host reaction that could be used to 
distinguish a particular isolate as causing BPS. However, there 
were some other interesting differences. Although isolate ES11A 
behaved in a similar manner to isolate LMBNN, which overcomes 
the Nbm-1 hypersensitivity gene in L. angustifolius plants [19,20], 
it was isolated from a L. angustifolius plant originally displaying 
systemic necrosis. ClYVV behaved like isolate LP, but whether 
ClYVV interacts with both Nbm-1 and the second putative 
BYMV hypersensitivity gene, or unknown ClYVV-specific genes 
in L. angustifolius, is not clear [19]. ClYVV and all group I and II 
isolates failed to infect P. sativum cv. Greenfeast although the 
group VI isolate LP did cause infection. This may be because 
this cultivar, like many commercial pea cultivars, may 
contain the BYMV resistance gene mo and ClYVV resistance 
genes cyv or cyv-2 [29,30], whose responses are strain specific. 
Induction of severe necrotic symptoms in V. faba by ClYVV but 
not the BYMV isolates is expected, as this is the classical method 
for distinguishing BYMV from ClYVV [10,20]. 

In this study, we used NGS to obtain complete virus genomes 
and it proved to have both advantages and disadvantages over 
traditional sequencing methods. It allowed large amounts of data 
to be generated quickly, but analysis of the data proved a major 
challenge. Many free programs exist for the assembly of NGS data 
(e.g. Velvet, SOAPdenovo, ABySS and Bowtie) but they all require 
the researcher to be proficient in the use of command line driven 
applications. For us, as so-called "benchtop biologists", 
Geneious and CLCGW were easy to learn and their cost was 
acceptable in view of the time saved in learning the use of 
command line driven programs. That said, our success was 
probably attributable to the small genome sizes of plant viruses, 
particularly BYMV and ClYVV, which are both c. 9535 nt long. 
Larger genomes from unpurified RNA samples would undoubtedly 
be much harder to piece together, but not impossible. We 
found in most cases (17 out of 23) there was sufficient average 
coverage to be confident of good genome representation for the 
isolate sequenced. These sequences had average coverages as low 
as 65 and 457 with remaining average coverages being greater 
than 737 and up to 12,313 times when mapped back to a reference 
sequence using Geneious. Currently, sequencing a human genome 
of approximately 3,000 Mb on an Illumina platform requires 30 
times coverage to be adequate [31]. Therefore, it seems reasonable 
to designate our virus genomes with less than 30 times coverage as 
draft sequences. Although not meeting minimum requirements for 
average coverage, they are still valuable data sets, particularly 
given the low numbers of complete or nearly complete BYMV 
genomes available (now 32 including those from this study). 
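
To make the coverage argument concrete, the following Python sketch estimates average coverage from the number of mapped reads and applies the 30 times draft threshold discussed above. It is a rough approximation rather than the calculation performed by Geneious: the function names are hypothetical, and it assumes every mapped read contributes its full 100 nt, which overestimates coverage for trimmed or partially mapped reads.

```python
def average_coverage(mapped_reads, read_length, genome_length):
    """Approximate average coverage as total mapped bases / genome length.

    Assumption: every mapped read aligns over its full read_length.
    """
    return mapped_reads * read_length / genome_length


def is_draft(coverage, minimum_coverage=30):
    """Flag a genome as draft when coverage is below the chosen minimum."""
    return coverage < minimum_coverage


# Lowest number of reads mapped to a reference in this study (471),
# 100 nt reads and a genome of c. 9535 nt.
low = average_coverage(471, 100, 9535)
low_is_draft = is_draft(low)
```

With these inputs the estimate is a little under five times, the same order as the lowest average coverage reported after mapping, and well below the 30 times threshold.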

The settings used in de novo assembly are sufficient to 
distinguish between more than one strain or group of a plant 
virus when present in the same sample, as previously demonstrated 
by Kehoe et al. [9]. In our case, the sample from a V. faba BYMV 
isolate (FB) retrieved from the culture collection also contained a 
nearly complete LP isolate genome. The contamination probably 
occurred more than ten years ago when they were maintained 
next to each other in the same glasshouse prior to freeze-drying 
and storage in the collection. In such instances, if we had only 
been using Geneious to map to a reference genome, we would 
have likely missed the second sequence. It is therefore important to 
perform de novo assembly, as well as mapping to a reference 
genome. In cases where either the mapping or the de novo 
sequence had a gap, it was usually resolved after alignment with 
the sequence from the second program. However, for genomes 
with coverage less than ten (i.e. the draft genomes) this method was 
ineffective. 
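
The idea of resolving a gap in one sequence by reference to the other can be sketched in Python as follows. This is an illustration only, not the Geneious consensus procedure: the function name is hypothetical, the two sequences are assumed to be already aligned to equal length, and marking disagreements as N is a choice made here.

```python
def merge_aligned(seq_a, seq_b, gap_chars="-N"):
    """Merge two aligned sequences, filling gaps in one from the other.

    Assumptions: '-' and 'N' both mark missing data, and positions
    where the two sequences disagree are reported as 'N'.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    merged = []
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars and b not in gap_chars:
            merged.append(b)  # gap in de novo contig, filled from mapping
        elif b in gap_chars and a not in gap_chars:
            merged.append(a)  # gap in mapping consensus, filled from contig
        elif a == b:
            merged.append(a)  # agreement (or gap in both)
        else:
            merged.append("N")  # conflicting bases flagged for inspection
    return "".join(merged)


# Each sequence has a gap that the other one covers.
consensus = merge_aligned("ACG--TAC", "ACGTT-AC")
```

When both sequences lack data at the same position, as is likely at very low coverage, nothing can be filled, which matches the observation that the approach did not help for the draft genomes.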

The uptake of NGS amongst plant virologists is increasing as 
the cost associated with it decreases [1]. The relatively small 
genome size of plant viruses allows us the opportunity to extract 
complete or nearly complete genomes using commercial packages. 
Use of NGS does raise concerns regarding the consequences of an 
increase in the discovery of virus or virus-like sequences. As such, 
MacDiarmid et al. [32] made recommendations regarding the 
identification of plant viruses through NGS, and the potential 
biosecurity issues associated with this. One of the recommenda- 
tions was that the term "uncultured virus" should be used with any 
plant virus sequence not associated with a recognized virus 
infection. We support this recommendation whole-heartedly. 

We know of no recommendation regarding requirements for 
depth of coverage for plant virus genomes, particularly ones 
involving new virus discoveries. Until such time as an appropriate 
set of comparative studies is done, we would recommend 
following in the path of our human genome colleagues by 
requiring a minimum coverage of at least 30 times, but this would 
likely lead to many nearly complete or draft plant virus genomes. 
However, as with BYMV for example, we required coverage well 
into the 1000s to ensure a complete genome (including 5' and 3' 
UTRs, a constant challenge for plant virologists). Our samples sent 
for sequencing were total RNA, so different methods of sample 
preparation might have increased the numbers of virus reads, for 
example, use of subtractive hybridization [4], or extracting 
dsRNA first, followed by random cDNA synthesis [1,33]. Despite 
this, there is no doubt that NGS has been an exceedingly useful 
tool for our study. 



References 

1. Boonham N, Kreuze J, Winter S, van der Vlugt R, BergervoetJ, et al. (2014) 
\lcthods in virus diagnostics: from ELISA to next generation sequencing. Virus 
Res 186: 20-31. 

2. Mardis ER (2013) Next-generation sequencing platforms. Aim Rev Analyt 
Chem 6: 287-303. 

3. Koboldt DC, Steinberg KM, Larson DE, Wilson RK, Mardis ER (2013) The 
Next-generation sequencing revolution and its impact on genomics. Cell 155: 
27-38. 

4. Adams IP, Glover RH, Monger WA, Mumford R, Jackcvicicnc E, rt al. (2009) 
Ncxt-gcncration sequencing and mctagcnomic analysis: a universal diagnostic 
tool in plant \'irolo^g)'. Mol Plant Pathol 10: 537-545. 

5. Roossinck AIJ (2012) Plant virus mctagcnomics: biodiversity and ecology. Ann 
Rev Gen 46: 359-369. 

6. WyUe SJ, Luo H, Li H, Jones MGK (2012) Multiple polyadenylated RNA 
viruses detected in pooled cultivated and wild plant samples. Arch Virol 157: 
271-284. 

7. WyHe SJ, Li H, Sivasithamparam K, Jones MGK (2014) Complete genome 
analysis of three isolates of narcissus late season yellows virus and two of 
narcissus yellow stripe virus: three species or one? Arch Virol 159: 1521—1525. 

8. Jones RAC (2014) Trends in plant virus epidemiology: opportunities from new 
or improved technologies. Virus Res 186: 3-19. 

9. Kehoe MA, Coutts BA, Buirchell BJ, Jones RAC (2014) Ilardenbergia mosaic 
virus: crossing the barrier between native and introduced plant species. Virus 
Res 184: 87-92. 

10. Bos L (1970) Bean yellow mosaic virus. Association of Applied Biologists. 
Descriptions of Plant Viruses No. 40. 

11. Edwardson JR, Christie RG (1991) Bean yellow mosaic virus. In CRC 
Handbook of viruses infecting legumes ppl37— 148. CRC Press, Boca Raton, PL. 

12. Berlandier FA, Thackray DJ, Jones RAC, Latham LJ, Cartwright L (1997) 
Determining the relative roles of different aphid species as vectors of cucumber 
mosaic and bean yellow mosaic viruses in lupins. Ann Appl Biol 131: 297-314. 

13. Jones RAC, McLean GD (1989). Virus diseases of lupins. Ann Appl Biol 114: 
609-637. 

14. Jones RAC (2001) Developing integrated disease management strategies against 
non-persistently aphid-borne viruses: A model programme. Integ. Pest Man. 
Reviews 6: 15-46. 

15. Jones RAC, Coutts BA, Cheng Y (2003) Yield limiting potential of necrotic and 
non-necrotic strains of Bean yellow mosaic virus in narrow-leafed lupin (Lupinus 
angustifolius). Aust J Agric Res 54: 849-859. 

16. Kehoe MA, Buirchell BJ, Coutts BA, Jones RAC (2014) Black pod syndrome of 
Lupinus angustifolius is caused by late infection with Bean yellow mosaic virus. 
Plant Dis 98: 739-745. 

17. Buirchell BJ (2008) Narrow-leafed lupin breeding in Australia - where to from 
here? Pages 226-230 in: Lupins for Health and Wealth, Proceedings of the 12th 
International Lupin Conference, 14-18 September 2008, Fremantle, Western 
Australia. JA Palta and JB Berger, editors. International Lupin Association, 
Canterbury, New Zealand. 



Supporting Information 

Table S1 Nucleotide percentage similarities of the coding 
regions of thirty three Bean yellow mosaic virus and two Clover 
yellow vein virus isolates, calculated in MEGA 5.2.1 using a 
pairwise comparison with the number of differences model. 

(DOCX) 

Acknowledgments 

We thank E. Gajda, S. Vincent and M. Banovic for glasshouse and 
laboratory support. 

Author Contributions 

Conceived and designed the experiments: MAK BAC BJB RACJ. 
Performed the experiments: MAK. Analyzed the data: MAK. Contributed 
reagents/materials/analysis tools: MAK. Contributed to the writing of the 
manuscript: MAK BAC BJB RACJ. 



18. Cheng Y, Jones RAC (1999) Distribution and incidence of necrotic and 
non-necrotic strains of bean yellow mosaic virus in wild and crop lupins. 
Aust J Agric Res 50: 589-599. 

19. Cheng Y, Jones RAC (2000) Biological properties of necrotic and non-necrotic 
strains of bean yellow mosaic virus in cool season grain legumes. Ann Appl Biol 
136: 215-227. 

20. Jones RAC, Smith LJ (2005) Inheritance of hypersensitive resistance to Bean 
yellow mosaic virus in narrow-leafed lupin (Lupinus angustifolius). Ann Appl 
Biol 146: 539-543. 

21. Wylie SJ, Coutts BA, Jones MGK, Jones RAC (2008) Phylogenetic analysis of 
Bean yellow mosaic virus isolates from four continents: relationship between the 
seven groups found and their hosts and origins. Plant Dis 92: 1596-1603. 

22. Clark MF, Adams AN (1977) Characteristics of the microplate method of 
enzyme-linked immunosorbent assay for the detection of plant viruses. J Gen 
Virol 34: 475-483. 

23. Torrance L, Pead MT (1986) The application of monoclonal antibodies to 
routine tests for two plant viruses, pp 103-118 in: Developments in applied 
biology. 1 Developments and applications on virus testing. Jones RAC and 
Torrance L, editors. Association of Applied Biologists, Wellesbourne, UK. 

24. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local 
alignment search tool. J Mol Biol 215: 403-410. 

25. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, et al. (2011) MEGA5: 
Molecular evolutionary genetics analysis using maximum likelihood, evolutionary 
distance, and parsimony methods. Mol Biol Evol 28: 2731-2739. 

26. Adams MJ, Antoniw JF, Fauquet CM (2005) Molecular criteria for genus and 
species discrimination within the family Potyviridae. Arch Virol 150: 459-479. 

27. Uyeda I, Takahasi T, Shikata E (1991) Relatedness of the nucleotide sequence of 
the 3'-terminal region of clover yellow vein potyvirus RNA to bean yellow 
mosaic potyvirus RNA. Intervirol 1991: 234-245. 

28. Hollings M, Stone OM (1974) Clover yellow vein virus. Association of Applied 
Biologists. Descriptions of Plant Viruses No. 131. 

29. Schroeder WT, Provvidenti R (1971) A common gene for resistance to Bean 
yellow mosaic virus and Watermelon mosaic virus 2 in Pisum sativum. 
Phytopathology 61: 846-847. 

30. Provvidenti R (1987) Inheritance of resistance to clover yellow vein virus in 
Pisum sativum. J Heredity 70: 126-128. 

31. Wetterstrand KA (2014) DNA Sequencing Costs: Data from the NHGRI 
Genome Sequencing Program (GSP) Available: www.genome.gov/ 
sequencingcosts. Accessed 2014 Feb 1. 

32. MacDiarmid R, Rodoni B, Melcher U, Ochoa-Corona F, Roossinck M (2013) 
Biosecurity implications of new technology and discovery in plant virus research. 
PLoS Pathogens 9: e1003337. 

33. Al Rwahnih M, Daubert S, Úrbez-Torres JR, Cordero F, Rowhani A (2011) 
Deep sequencing evidence from single grapevine plants reveals a virome 
dominated by mycoviruses. Arch Virol 156: 397-403. 






